Learning Cross-cutting Systems of Categories
Abstract
Most natural domains can be represented in multiple ways: animals may be thought of in terms of their taxonomic groupings or their ecological niches, and foods may be thought of in terms of their nutritional content or their social role. We present a computational framework that discovers multiple systems of categories given information about a domain of objects and their properties. Each system of object categories accounts for a distinct and coherent subset of the features. A first experiment shows that our CrossCat model predicts human learning in an artificial category learning task. A second experiment shows that the model discovers important structure in two real-world domains. Traditional models of categorization usually search for a single system of categories: we suggest that these models do not predict human performance in our task, and that they miss important structure in our real-world examples.
People explain different aspects of everyday objects in different ways. For example, steak is high in iron because it is a meat; however, it is often served with wine because it is a dinner food. The different ways of thinking about steak underscore different ways of thinking about the domain of foods: as a system of taxonomic categories like meats and vegetables, or as a system of situational categories like breakfast foods and dinner foods. If you were to plan meals for a family trip, you would draw upon both of these systems of categories, consulting the taxonomy to ensure that meals were nutritionally balanced and consulting the situational system to ensure that there were foods appropriate for the different times of the day. In any domain, objects have different kinds of properties, and more than one system of categories is needed to explain the different relationships among objects in the domain.
Psychologists have experimentally confirmed that multiple systems of categories are needed to account for human behavior.
Ross and Murphy (1999) showed that subjects draw on at least two different kinds of knowledge to categorize and reason about foods: knowledge about taxonomic categories and knowledge about foods that tend to be eaten together. Similarly, studies have shown that animals may be thought about in terms of taxonomic categories such as mammals and reptiles, or ecological categories such as predators and prey. For example, reasoning about anatomical properties appears to draw on taxonomic categories, but reasoning about disease transmission may rely on ecological categories (see Heit and Rubinstein, 1994; Shafto and Coley, 2003; Shafto et al., 2005).
Most previous models of categorization have attempted to discover a single system of categories within a given domain (but see Martin and Billman, 1994). This paper introduces CrossCat, a Bayesian framework for discovering multiple systems of categories: for example, discovering that foods can be organized into a system of taxonomic categories and a system of situational categories. A key feature of our approach is that we need not specify the number of systems of categories in advance, or the number of categories within each system: our model automatically discovers a representation of appropriate complexity.
To test our model, we studied human performance in an unsupervised learning task, and analyzed the structure of two real-world datasets: foods and animals. Our model provides a good account of human performance, and captures intuitively compelling structures in both of our datasets. Of the previous models that search for a single system of categories, our approach is related most closely to Anderson’s rational analysis of categorization (Anderson, 1991). We compare our approach to this model throughout, and argue that models that rely on a single system of categories cannot provide an adequate account of human learning and reasoning.
A generative model for learning systems of categories
Assume we are provided with an array of objects and features, for example the matrix of foods shown in Figure 1. Our goal is to organize the objects into one or more systems of categories, and to discover the features best explained by each system. A good solution for the food matrix is shown in Figure 2. There are two systems of categories: the first is a situational system partitioned into breakfast foods and dinner foods, and the second is a taxonomic system that includes starches, meats, and dairy foods. Intuitively, the solution is a good one because each feature respects the structure of its associated category system; for example, “served with wine” discriminates perfectly between the situational categories, but is not as clean with respect to the taxonomic categories (half of the starches are served with wine and half are not). More formally, CrossCat takes as input a list of O objects, a list of F features, and an O by F data matrix.
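The intuition that a feature "respects" one category system more than another can be illustrated with a small sketch. Everything below is an assumption made for illustration: the six foods and their feature values are hypothetical (they are not the matrix from Figure 1), and the `purity` score is a simple stand-in for feature fit, not the Bayesian scoring that CrossCat uses.

```python
from collections import Counter

# Hypothetical toy data, NOT the matrix from Figure 1.
objects = ["bacon", "steak", "cereal", "pasta", "milk", "cheese"]
features = ["served_with_wine", "high_in_iron"]

# O-by-F binary matrix: data[o][f] is 1 if object o has feature f.
data = [
    [0, 1],  # bacon
    [1, 1],  # steak
    [0, 0],  # cereal
    [1, 0],  # pasta
    [0, 0],  # milk
    [1, 0],  # cheese
]

# Two candidate systems of categories, each a partition of the objects.
systems = {
    "situational": {
        "bacon": "breakfast", "cereal": "breakfast", "milk": "breakfast",
        "steak": "dinner", "pasta": "dinner", "cheese": "dinner",
    },
    "taxonomic": {
        "bacon": "meat", "steak": "meat",
        "cereal": "starch", "pasta": "starch",
        "milk": "dairy", "cheese": "dairy",
    },
}


def purity(feature, system):
    """Fraction of objects whose feature value matches the majority
    value of their category (illustrative measure, an assumption)."""
    col = features.index(feature)
    values_by_category = {}
    for row, obj in enumerate(objects):
        category = systems[system][obj]
        values_by_category.setdefault(category, []).append(data[row][col])
    majority_total = sum(
        Counter(values).most_common(1)[0][1]
        for values in values_by_category.values()
    )
    return majority_total / len(objects)


def best_system(feature):
    """Name of the system under which the feature is cleanest."""
    return max(systems, key=lambda name: purity(feature, name))


if __name__ == "__main__":
    for feature in features:
        scores = {name: purity(feature, name) for name in systems}
        print(feature, scores, "->", best_system(feature))
```

In this toy data, "served with wine" separates the situational categories perfectly but splits every taxonomic category in half, while "high in iron" lines up with the taxonomic system. Unlike this sketch, which takes both partitions as given, CrossCat has to discover the systems and the assignment of features to systems from the data.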
Publication date: 2006